Ambiguous grammar

In computer science, a context-free grammar is said to be an ambiguous grammar if there exists a string that can be generated by the grammar in more than one way (i.e., the string admits more than one parse tree or, equivalently, more than one leftmost derivation). A context-free language is inherently ambiguous if all context-free grammars generating that language are ambiguous.

Some programming languages have ambiguous grammars; in this case, semantic information is needed to select the intended parse tree of an ambiguous construct. For example, in C the following:

x * y ;

can be interpreted as either:

the declaration of an identifier named y of type pointer-to-x, or
an expression in which x is multiplied by y and then the result is discarded.

To correctly choose between the two possible interpretations, a compiler must consult its symbol table to find out whether x has been declared as a typedef name that is visible at this point.

1 Example
2 Recognizing ambiguous grammars
3 Inherently ambiguous languages
4 References
5 External links

Example

The context free grammar

A → A + A | A − A | a

is ambiguous since there are two leftmost derivations for the string a + a + a:

A	→ A + A	A	→ A + A
	→ a + A		→ A + A + A (First A is replaced by A+A. Replacement of the second A would yield a similar derivation)
	→ a + A + A		→ a + A + A
	→ a + a + A		→ a + a + A
	→ a + a + a		→ a + a + a

As another example, the grammar is ambiguous since there are two parse trees for the string a + a − a:

The language that it generates, however, is not inherently ambiguous; the following is a non-ambiguous grammar generating the same language:

A → A + a | A − a | a

Recognizing ambiguous grammars

The general question of whether a grammar is not ambiguous is undecidable. No algorithm can exist to determine the ambiguity of a grammar because the undecidable Post correspondence problem can be encoded as an ambiguity problem. At least, there are tools implementing some semi-decision procedure for detecting ambiguity of context-free grammars, see e.g. (Axelsson, Heljanko & Lange 2008).

There is an obvious difficulty in parsing an ambiguous grammar by a deterministic parser (see deterministic context-free grammar) but nondeterministic parsing imposes a great efficiency penalty. Most constructs of interest to parsing can be recognized by unambiguous grammars. Some ambiguous grammars can be converted into unambiguous grammars, but no general procedure for doing this is possible just as no algorithm exists for detecting ambiguous grammars. Compiler generators such as YACC include features for disambiguating some kinds of ambiguity, such as by using the precedence and associativity constraints.

Inherently ambiguous languages

Inherent ambiguity was first shown to exist in 1961 by Rohit Parikh in an MIT research report Parikh (1961) and later a journal version Parikh (1966)).

While some context-free languages (the set of strings that can be generated by a grammar) have both ambiguous and unambiguous grammars, there exist context-free languages for which no unambiguous context-free grammar can exist. An example of an inherently ambiguous language is the union of $\{a^n b^m c^m d^n | n, m > 0\}$ with $\{a^n b^n c^m d^m | n, m > 0\}$ . This set is context-free, since the union of two context-free languages is always context-free. But Hopcroft & Ullman (1979) give a proof that there is no way to unambiguously parse strings in the (non-context-free) subset $\{a^n b^n c^n d^n | n > 0\}$ which is the intersection of these two languages.

References

Parikh, Rohit. (January 1961). Language-generating devices. Quarterly Progress Report, Research Laboratory of Electronics, MIT.

Gross, Maurice. (1964). Inherent ambiguity of minimal linear grammars. 7:3. Information and Control. pp. 366–368.

Parikh, Rohit. (1966). On Context Free Languages. 13. Journal of the Association for Computing Machinery. pp. 570–81.

Michael, Harrison. (1978). Introduction to Formal Language Theory. Addison-Wesley.

Hopcroft, John E.; Ullman, Jeffrey D. (1979). Introduction to Automata Theory, Languages, and Computation (1st ed.). Addison-Wesley.

Hopcroft, John.; Mowani, Rajeev.; Ullman, Jeffrey. (2001). Introduction to Automata Theory, Languages and Computation (second edition). Addison Wesley. pp. 217.

Axelsson, Roland; Heljanko, Keijo; Lange, Martin (2008). "Analyzing Context-Free Grammars Using an Incremental SAT Solver". Proceedings of the 35th International Colloquium on Automata, Languages and Programming (ICALP'08), Reykjavik, Iceland. Lecture Notes in Computer Science. 5126. Springer-Verlag. pp. 410–422.

Brabrand, Claus; Giegerich, Robert; Møller, Anders (2010). "Analyzing Ambiguity of Context-Free Grammars". Science of Computer Programming. 75. Elsevier. pp. 176–191.

External links

dk.brics.grammar - a grammar ambiguity analyzer.
CFGAnalyzer - tool for analyzing context-free grammars with respect to language universality, ambiguity, and similar properties.